Compounds in dictionary-based cross-language information retrieval
Abstract
Compound words form an important part of natural language. From the point of view of cross-lingual information retrieval (CLIR), what matters is that many natural languages are highly productive in forming compounds, so translation resources cannot include entries for all of them. Compounds are also often content-bearing words in a sentence. In Swedish, German and Finnish, roughly one tenth of the words in a text prepared for information retrieval purposes are compounds. Important research questions concerning compound handling in dictionary-based CLIR are 1) splitting compounds into components, 2) normalisation of components, 3) translation of components and 4) query structuring for compounds and their components in the target language. This study evaluates the impact of compound processing on the performance of the cross-language information retrieval process, and the results indicate that the effect is clearly positive.
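The four steps listed in the abstract can be illustrated with a minimal sketch. It is not the method evaluated in the study: the lexicon, the Swedish-to-English dictionary, the single linking element "s", the longest-prefix-first splitting strategy and the InQuery-style operator names (#and, #syn) are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon of source-language base forms.
LEXICON = {"sjuk", "hus", "personal", "kvalitet", "kontroll"}

# Hypothetical toy Swedish-to-English translation dictionary.
TRANSLATIONS: Dict[str, List[str]] = {
    "sjuk": ["sick", "ill"],
    "hus": ["house", "building"],
    "personal": ["staff", "personnel"],
    "kvalitet": ["quality"],
    "kontroll": ["control", "inspection"],
}

# Assumed joining morphemes removed during normalisation.
LINKING_ELEMENTS = ("s",)


def split_compound(word: str) -> Optional[List[str]]:
    """Steps 1 and 2: split `word` into normalised lexicon components.

    Returns None when no sequence of lexicon entries covers the word.
    """
    if not word:
        return []
    # Try the longest prefix first, then back off to shorter ones.
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        candidates = [prefix]
        # Normalisation: also try the prefix without a linking element.
        for link in LINKING_ELEMENTS:
            if prefix.endswith(link) and len(prefix) > len(link):
                candidates.append(prefix[: -len(link)])
        for base in candidates:
            if base in LEXICON:
                rest = split_compound(word[end:])
                if rest is not None:
                    return [base] + rest
    return None


def structured_query(word: str) -> str:
    """Steps 3 and 4: translate components and group alternatives."""
    components = split_compound(word) or [word]
    groups = []
    for comp in components:
        # Components missing from the dictionary pass through unchanged.
        targets = TRANSLATIONS.get(comp, [comp])
        groups.append("#syn(" + " ".join(targets) + ")")
    return "#and(" + " ".join(groups) + ")"


if __name__ == "__main__":
    print(split_compound("kvalitetskontroll"))
    print(structured_query("sjukhuspersonal"))
```

Grouping the translation alternatives of each component keeps a component with many dictionary senses from outweighing one with few, which is one common motivation for structuring the target-language query rather than listing all translations flat.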
Related resources
CLEF Experiments at Maryland: Statistical Stemming and Backoff Translation
The University of Maryland participated in the CLEF 2000 multilingual task, submitting three official runs that explored the impact of applying language-independent stemming techniques to dictionary-based cross-language information retrieval. The paper begins by describing a cross-language information retrieval architecture based on balanced document translation. A four-stage backoff strategy for ...
Improved Cross-Language Retrieval using Backoff Translation
The limited coverage of available translation lexicons can pose a serious challenge in some cross-language information retrieval applications. We present two techniques for combining evidence from dictionary-based and corpus-based translation lexicons, and show that backoff translation outperforms a technique based on merging lexicons.
CLEF Experiments at the University of Maryland: Statistical Stemming and Back-off Translation Strategies
The University of Maryland participated in the CLEF 2000 multilingual task, submitting three official runs that explored the impact of applying language-independent stemming techniques to dictionary-based cross-language information retrieval. The paper begins by describing a cross-language information retrieval architecture based on balanced document translation. A four-stage backoff strategy for i...
Arabic/English Cross Language Information Retrieval Using a Bilingual Dictionary
With the increase of multilingual information available online and the growing number of non-native English speakers (Arabic users) browsing the Internet, it has become more important to have information retrieval systems that can carry the retrieval process across language boundaries, that is, cross-language information retrieval (CLIR) systems. The CLIR system responds to the user query in a comprehens...
Journal title:
- Inf. Res.
Volume 7
Publication date: 2002